Learning Taxonomies by Dependence Maximization
Authors
Abstract
We introduce a family of unsupervised algorithms, numerical taxonomy clustering, that simultaneously cluster data and learn a taxonomy encoding the relationships between the clusters. The algorithms work by maximizing the dependence between the taxonomy and the original data. The resulting taxonomy is a more informative visualization of complex data than simple clustering; in addition, taking into account the relations between different clusters is shown to substantially improve the quality of the clustering when compared with state-of-the-art algorithms in the literature (both spectral clustering and a previous dependence maximization approach). We demonstrate our algorithm on image and text data.
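The abstract does not name the dependence measure. As a rough illustration only, the sketch below assumes it is the Hilbert-Schmidt Independence Criterion (the measure named in the feature-selection entry listed on this page) and scores a candidate cluster assignment by its empirical HSIC with the data. The helper names (`rbf_kernel`, `partition_kernel`, `hsic`), the Gaussian kernel, and the synthetic two-blob data are all hypothetical choices, and the taxonomy-learning step is not modeled.

```python
import numpy as np

def rbf_kernel(X, sigma=1.0):
    # Gaussian kernel from pairwise squared Euclidean distances (assumed kernel choice).
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma ** 2))

def partition_kernel(labels):
    # L[i, j] = 1 when points i and j share a cluster, else 0.
    labels = np.asarray(labels)
    return (labels[:, None] == labels[None, :]).astype(float)

def hsic(K, L):
    # Biased empirical HSIC: trace(K H L H) / (n - 1)^2 with H = I - (1/n) 1 1^T.
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

rng = np.random.default_rng(0)
# Hypothetical data: two well-separated 2-D blobs of 50 points each.
X = np.vstack([rng.normal(-3.0, 0.5, (50, 2)), rng.normal(3.0, 0.5, (50, 2))])
true_labels = np.repeat([0, 1], 50)
shuffled_labels = rng.permutation(true_labels)

K = rbf_kernel(X, sigma=2.0)
score_true = hsic(K, partition_kernel(true_labels))
score_shuffled = hsic(K, partition_kernel(shuffled_labels))
print(score_true, score_shuffled)
```

Under these assumptions, a labeling that matches the structure of the data should score higher than a shuffled one, which is the sense in which a dependence-maximizing method would prefer it.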
Similar resources
Feature Selection via Dependence Maximization
We introduce a framework of feature selection based on dependence maximization between the selected features and the labels of an estimation problem, using the Hilbert-Schmidt Independence Criterion. The key idea is that good features should be highly dependent on the labels. Our approach leads to a greedy procedure for feature selection. We show that a number of existing feature selectors are ...
Learning to integrate web taxonomies
We investigate machine learning methods for automatically integrating objects from different taxonomies into a master taxonomy. This problem is not only currently pervasive on the Web, but is also important to the emerging Semantic Web. A straightforward approach to automating this process would be to build classifiers through machine learning and then use these classifiers to classify objects ...
Learning Co-Substructures by Kernel Dependence Maximization
Modeling associations between items in a dataset is a problem that is frequently encountered in data and knowledge mining research. Most previous studies have simply applied a predefined fixed pattern for extracting the substructure of each item pair and then analyzed the associations between these substructures. Using such fixed patterns may not, however, capture the significant association. W...
Influence Maximization in Social Networks using Learning Automata
The influence maximization problem is one of the challenges in online social networks. It refers to finding a small set of members of a social network whose activation maximizes information propagation under a propagation model such as the independent cascade model. For the maximization problem, a greedy algorithm has been presented which is close to optimal response by...
Energy Scheduling in Power Market under Stochastic Dependence Structure
Since the emergence of the power market, the target of power generating utilities has mainly switched from cost minimization to revenue maximization. They dispatch their power generation units in the uncertain environment of the power market. As a result, multi-stage stochastic programming has been applied widely by many power generating agents as a suitable tool for dealing with self-scheduling...